AITopics | wasserstein dro

Distributionally Robust Optimization with Data Geometry

Neural Information Processing Systems, Feb-12-2026, 06:18:05 GMT

It has been well documented that high-dimensional data approximately resides on low-dimensional manifolds.

artificial intelligence, correlation, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Newfoundland and Labrador > Labrador (0.04)
(2 more...)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Representation-Aware Distributionally Robust Optimization: A Knowledge Transfer Framework

Wang, Zitao, Si, Nian, Liu, Molei

arXiv.org Artificial Intelligence, Sep-12-2025

We propose REpresentation-Aware Distributionally Robust Estimation (READ), a novel framework for Wasserstein distributionally robust learning that accounts for predictive representations when guarding against distributional shifts. Unlike classical approaches that treat all feature perturbations equally, READ embeds a multidimensional alignment parameter into the transport cost, allowing the model to differentially discourage perturbations along directions associated with informative representations. This yields robustness to feature variation while preserving invariant structure. Our first contribution is a theoretical foundation: we show that seminorm regularizations for linear regression and binary classification arise as Wasserstein distributionally robust objectives, thereby providing tractable reformulations of READ and unifying a broad class of regularized estimators under the DRO lens. Second, we adopt a principled procedure for selecting the Wasserstein radius using the techniques of robust Wasserstein profile inference. This further enables the construction of valid, representation-aware confidence regions for model parameters with distinct geometric features. Finally, we analyze the geometry of READ estimators as the alignment parameters vary and propose an optimization algorithm to estimate the projection of the global optimum onto this solution surface. This procedure selects among equally robust estimators while optimally constructing a representation structure. We conclude by demonstrating the effectiveness of our framework through extensive simulations and a real-world study, providing a powerful robust estimation grounded in learning representation.

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.09371

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits

Shen, Yi, Xu, Pan, Zavlanos, Michael M.

arXiv.org Artificial Intelligence, Jan-17-2024

Off-policy evaluation and learning are concerned with assessing a given policy and learning an optimal policy from offline data without direct interaction with the environment. Often, the environment in which the data are collected differs from the environment in which the learned policy is applied. To account for the effect of different environments during learning and execution, distributionally robust optimization (DRO) methods have been developed that compute worst-case bounds on the policy values assuming that the distribution of the new environment lies within an uncertainty set. Typically, this uncertainty set is defined based on the KL divergence around the empirical distribution computed from the logging dataset. However, the KL uncertainty set fails to encompass distributions with varying support and lacks awareness of the geometry of the distribution support. As a result, KL approaches fall short in addressing practical environment mismatches and lead to over-fitting to worst-case scenarios. To overcome these limitations, we propose a novel DRO approach that employs the Wasserstein distance instead. While Wasserstein DRO is generally computationally more expensive compared to KL DRO, we present a regularized method and a practical (biased) stochastic gradient descent method to optimize the policy efficiently. We also provide a theoretical analysis of the finite sample complexity and iteration complexity for our proposed method. We further validate our approach using a public dataset that was recorded in a randomized stroke trial.
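
The KL-based worst-case bound that this abstract takes as its baseline can be sketched with the standard dual form of KL-constrained DRO, inf over Q with KL(Q || P) <= delta of E_Q[R] = sup over alpha > 0 of { -alpha log E_P[exp(-R / alpha)] - alpha delta }. The snippet below is a minimal illustration and not the authors' method: the function name `kl_robust_value`, the grid over `alpha`, and the toy rewards are all assumptions, and the rewards are assumed to be samples already drawn under the policy being evaluated.

```python
import numpy as np

def kl_robust_value(rewards, delta, alphas=None):
    """Approximate worst-case mean reward over a KL ball of radius delta
    around the empirical distribution, using a grid search over the dual
    variable alpha (hypothetical helper, illustration only)."""
    rewards = np.asarray(rewards, dtype=float)
    if alphas is None:
        alphas = np.logspace(-3, 3, 400)  # assumed search grid for alpha
    r_min = rewards.min()
    best = -np.inf
    for alpha in alphas:
        # Stable evaluation of -alpha * log(mean(exp(-r / alpha))):
        # shifting by r_min keeps every exponent <= 0.
        shifted = -(rewards - r_min) / alpha
        log_mean = np.log(np.mean(np.exp(shifted)))
        value = r_min - alpha * log_mean - alpha * delta
        best = max(best, value)
    return float(best)

rewards = [1.0, 2.0, 3.0, 4.0]
robust_small = kl_robust_value(rewards, delta=0.1)
robust_large = kl_robust_value(rewards, delta=0.5)
```

Each value of `alpha` gives a valid lower bound on the worst-case mean, so the grid maximum is a conservative estimate; a larger `delta` can only lower it. Every distribution in the KL ball is supported on the observed rewards, which is the limitation the abstract points to.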

machine learning research, transaction, wasserstein dro, (12 more...)

arXiv.org Artificial Intelligence

2309.08748

Country: Europe > France (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

On Generalization and Regularization via Wasserstein Distributionally Robust Optimization

Wu, Qinyu, Li, Jonathan Yu-Meng, Mao, Tiantian

arXiv.org Artificial Intelligence, Dec-12-2022

Wasserstein distributionally robust optimization (DRO) has found success in operations research and machine learning applications as a powerful means to obtain solutions with favourable out-of-sample performances. Two compelling explanations for the success are the generalization bounds derived from Wasserstein DRO and the equivalency between Wasserstein DRO and the regularization scheme commonly applied in machine learning. Existing results on generalization bounds and the equivalency to regularization are largely limited to the setting where the Wasserstein ball is of a certain type and the decision criterion takes certain forms of an expected function. In this paper, we show that by focusing on Wasserstein DRO problems with affine decision rules, it is possible to obtain generalization bounds and the equivalency to regularization in a significantly broader setting where the Wasserstein ball can be of a general type and the decision criterion can be a general measure of risk, i.e., nonlinear in distributions. This allows for accommodating many important classification, regression, and risk minimization applications that have not been addressed to date using Wasserstein DRO. Our results are strong in that the generalization bounds do not suffer from the curse of dimensionality and the equivalency to regularization is exact. As a byproduct, our regularization results broaden considerably the class of Wasserstein DRO models that can be solved efficiently via regularization formulations.
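
The equivalence between Wasserstein DRO and regularization mentioned above can be checked numerically in one classical special case. The sketch assumes an absolute-value loss with a linear (affine) decision rule, a type-1 Wasserstein ball with Euclidean transport cost on the features only, and unbounded feature support; under those assumptions the worst-case expected loss equals the empirical loss plus eps times the Euclidean norm of beta. This is not the general result of the paper, and all function names are hypothetical.

```python
import numpy as np

def empirical_abs_loss(beta, X, y):
    """Mean absolute residual of the linear predictor X @ beta."""
    return float(np.mean(np.abs(y - X @ beta)))

def regularized_objective(beta, X, y, eps):
    """Empirical loss plus eps * Lip(loss) * dual norm of beta.
    The absolute loss has Lipschitz constant 1, and the Euclidean norm
    is its own dual norm."""
    return empirical_abs_loss(beta, X, y) + eps * float(np.linalg.norm(beta))

def adversarial_shift(beta, X, y, eps):
    """Move every feature vector by Euclidean distance eps in the direction
    that increases its absolute residual the most (transport cost = eps)."""
    residual = y - X @ beta
    sign = np.where(residual >= 0, 1.0, -1.0)
    direction = beta / np.linalg.norm(beta)
    return X - eps * sign[:, None] * direction[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(size=50)
eps = 0.1

X_adv = adversarial_shift(beta, X, y, eps)
worst_case = empirical_abs_loss(beta, X_adv, y)   # loss under the shifted data
bound = regularized_objective(beta, X, y, eps)    # regularized upper bound
```

The Lipschitz property of the loss together with the Cauchy-Schwarz inequality shows that no distribution in the ball can exceed `bound`, and the explicit shift attains it, so `worst_case` and `bound` agree up to floating-point error.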

artificial intelligence, machine learning, wasserstein ball, (13 more...)

arXiv.org Artificial Intelligence

2212.05716

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.14)
Asia > China (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

Sinkhorn Distributionally Robust Optimization

Wang, Jie, Gao, Rui, Xie, Yao

arXiv.org Machine Learning, Sep-24-2021

Decision-making problems under uncertainty have broad applications in operations research, machine learning, engineering, and economics. When the data involves uncertainty due to measurement error, insufficient sample size, contamination, and anomalies, or model misspecification, distributionally robust optimization (DRO) is a promising approach to data-driven optimization, by seeking a minimax robust optimal decision that minimizes the expected loss under the most adverse distribution within a given set of relevant distributions, called the ambiguity set. It provides a principled framework to produce a solution with more promising out-of-sample performance than the traditional sample average approximation (SAA) method for stochastic programming [86]. We refer to [81] for a recent survey on DRO. At the core of DRO is the choice of the ambiguity set. Ideally, a good ambiguity set should take account of the properties of practical applications while maintaining the computational tractability of the resulting DRO formulation; and it should be rich enough to contain all distributions relevant to the decision-making but, at the same time, should not include unnecessary distributions that lead to overly conservative decisions. Various DRO formulations have been proposed in the literature. Among them, the ambiguity set based on Wasserstein distance has recently received much attention [104, 67, 17, 46]. The Wasserstein distance incorporates the geometry of sample space, and thereby is suitable for comparing distributions with non-overlapping supports and hedging against data perturbations [46].
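
The remark about non-overlapping supports can be made concrete with a small one-dimensional sketch. It assumes two empirical distributions with the same number of equally weighted samples, for which the order-1 Wasserstein distance reduces to the mean absolute difference of the sorted samples; the helper names and the toy data are hypothetical.

```python
import numpy as np

def wasserstein_1d(xs, ys):
    """Order-1 Wasserstein distance between two equal-size empirical
    distributions on the real line (sorted samples are matched in order)."""
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    if xs.shape != ys.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(xs - ys)))

def kl_empirical(xs, ys):
    """KL(P_x || P_y) between empirical distributions; infinite as soon as
    P_x puts mass on a point that P_y does not."""
    vals_x, counts_x = np.unique(xs, return_counts=True)
    vals_y, counts_y = np.unique(ys, return_counts=True)
    p = counts_x / counts_x.sum()
    q = dict(zip(vals_y.tolist(), (counts_y / counts_y.sum()).tolist()))
    total = 0.0
    for value, p_mass in zip(vals_x.tolist(), p.tolist()):
        q_mass = q.get(value, 0.0)
        if q_mass == 0.0:
            return float("inf")
        total += p_mass * np.log(p_mass / q_mass)
    return float(total)

nominal = [0.0, 1.0, 2.0]
shifted = [0.1, 1.1, 2.1]   # same shape, every sample perturbed by 0.1
w1 = wasserstein_1d(nominal, shifted)
kl = kl_empirical(nominal, shifted)
```

Here `w1` is about 0.1, reflecting the size of the perturbation, while `kl` is infinite because the two supports do not overlap, so a KL ball around `nominal` could never contain `shifted`.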

arxiv preprint arxiv, optimization, sinkhorn dro, (12 more...)

arXiv.org Machine Learning

2109.11926

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Energy (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)

From Majorization to Interpolation: Distributionally Robust Learning using Kernel Smoothing

Zhu, Jia-Jie, Nemmour, Yassine, Schölkopf, Bernhard

arXiv.org Machine Learning, Feb-16-2021

We study the function approximation aspect of distributionally robust optimization (DRO) based on probability metrics, such as the Wasserstein and the maximum mean discrepancy. Our analysis leverages the insight that existing DRO paradigms hinge on function majorants such as the Moreau-Yosida regularization (supremal convolution). Deviating from those, this paper instead proposes robust learning algorithms based on smooth function approximation and interpolation. Our methods are simple in form and apply to general loss functions without knowing functional norms a priori. Furthermore, we analyze the DRO risk bound decomposition by leveraging smooth function approximators and the convergence rate for empirical kernel mean embedding.
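
The maximum mean discrepancy mentioned above is the distance between the kernel mean embeddings of two distributions. The following minimal sketch computes the biased (V-statistic) estimate of the squared MMD from samples; the Gaussian kernel, the bandwidth of 1.0, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_gram(a, b, bandwidth=1.0):
    """Gaussian kernel matrix k(a_i, b_j) for arrays of shape (n, d) and (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD: the squared RKHS distance between
    the empirical kernel mean embeddings of the samples x and y."""
    kxx = gaussian_gram(x, x, bandwidth).mean()
    kyy = gaussian_gram(y, y, bandwidth).mean()
    kxy = gaussian_gram(x, y, bandwidth).mean()
    return float(kxx + kyy - 2.0 * kxy)

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y_same = rng.normal(size=(200, 2))   # same distribution as x
y_shift = y_same + 2.0               # mean-shifted distribution

mmd_same = mmd_squared(x, y_same)
mmd_shift = mmd_squared(x, y_shift)
```

With these toy samples `mmd_shift` is noticeably larger than `mmd_same`, and the estimate for a sample against itself is zero up to rounding.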

algorithm, distributionally robust learning, optimization, (12 more...)

arXiv.org Machine Learning

2102.08474

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Finite-Sample Guarantees for Wasserstein Distributionally Robust Optimization: Breaking the Curse of Dimensionality

Gao, Rui

arXiv.org Machine Learning, Oct-30-2020

Wasserstein distributionally robust optimization (DRO) aims to find robust and generalizable solutions by hedging against data perturbations in Wasserstein distance. Despite its recent empirical success in operations research and machine learning, existing performance guarantees for generic loss functions are either overly conservative due to the curse of dimensionality, or plausible only in large sample asymptotics. In this paper, we develop a non-asymptotic framework for analyzing the out-of-sample performance for Wasserstein robust learning and the generalization bound for its related Lipschitz and gradient regularization problems. To the best of our knowledge, this gives the first finite-sample guarantee for generic Wasserstein DRO problems without suffering from the curse of dimensionality. Our results highlight the bias-variation trade-off intrinsic in the Wasserstein DRO, which balances between the empirical mean of the loss and the variation of the loss, measured by the Lipschitz norm or the gradient norm of the loss. Our analysis is based on two novel methodological developments that are of independent interest: 1) a new concentration inequality controlling the decay rate of large deviation probabilities by the variation of the loss, and 2) a localized Rademacher complexity theory based on the variation of the loss.
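
The bias-variation trade-off described above can be written schematically as a regularized objective. The display below is only a sketch of that reading of the abstract: the symbols \theta (model parameters), \ell_\theta (loss), z_i (samples), \rho (radius) and \mathcal{V} (variation functional) are notation assumed here, and the exact norms and constants are those of the paper, not of this display.

```latex
\min_{\theta}\;
\underbrace{\frac{1}{n}\sum_{i=1}^{n} \ell_\theta(z_i)}_{\text{empirical mean of the loss}}
\;+\;
\rho \cdot \underbrace{\mathcal{V}(\ell_\theta)}_{\text{variation of the loss}},
\qquad
\mathcal{V}(\ell_\theta) \in
\bigl\{ \text{Lipschitz norm of } \ell_\theta,\ \text{gradient norm of } \ell_\theta \bigr\}
```

A larger radius \rho puts more weight on controlling how fast the loss can change under data perturbations, at the price of a worse empirical fit.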

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Machine Learning

2009.04382

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Energy (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)
